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Abstract 

This article focuses on parameter estimation of multi-levels nonlinear mixed effects mod- 
els (MNLMEMs). These models are used to analyze data presenting multiple hierarchical 
levels of grouping (cluster data, clinical trials with several observation periods,...). The 
variability of the individual parameters of the regression function is thus decomposed as 
a between-subject variability and higher levels of variability (for example within-subject 
variability). We propose maximum likelihood estimates of parameters of those MNLMEMs 
with two levels of random effects, using an extension of the SAEM-MCMC algorithm. The 
extended SAEM algorithm is split into an explicit direct EM algorithm and a stochastic 
EM part. Compared to the original algorithm, additional sufficient statistics have to be 
approximated by relying on the conditional distribution of the second level of random ef- 
fects. This estimation method is evaluated on pharmacokinetic cross-over simulated trials, 
mimicking theophyllin concentration data. Results obtained on those datasets with either 
the SAEM algorithm or the FOCE algorithm (implemented in the nlme function of R soft- 
ware) are compared: biases and RMSEs of almost all the SAEM estimates are smaller than 
the FOCE ones. Finally, we apply the extended SAEM algorithm to analyze the pharma- 
cokinetic interaction of tenofovir on atazanavir, a novel protease inhibitor, from the ANRS 
107-Puzzle 2 study. A significant decrease of the area under the curve of atazanavir is found 
in patients receiving both treatments. 
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1 Introduction 

The use of non-linear mixed effects models (NLMEMs) increases in several fields such as agron- 
omy, forestry, clinical trials, population pharmacokinetics (PK) and pharmacodynamics or viral 
dynamics to model longitudinal data. In some settings, data can present multiple hierarchical 
levels of grouping, leading to multiple nested levels of variability. For instance, we may study 
patients that are grouped in medical services that are themselves grouped into hospitals. In this 
article, we call multilevel non-linear mixed effects models (MNLMEMs) the models that describe 
such data. MNLMEMs represent a natural extension of models with single variability level, and 
they have recently been subject to a great deal of attention in statistical literature. In the field 
of forestry, Hall and Clutter (|lj) analyze longitudinal measures of yield and growth that are mea- 
sured on each tree within a plot. In the field of agronomy, Rekaya et al. (2j) consider milk yield 
data where each cow is observed longitudinally during its first three lactations. Jaffrezic et al. 
( jj) perform genetic analyses of growth measurements in beef cattle acknowledging the fact that 
several cows come from the same sire. Another example is population PK, where concentration 
measurements may be taken with several patients over several distinct time intervals, that are 
often named periods or occasions. That grouping pattern is used for instance in PK cross-over 
trials. 

In NLMEMs with only one level of variability, often corresponding to between-subject vari- 
ability, the analysis results in the estimation of the fixed effects parameters and of the between- 
subject variability of the parameters, also called inter-subject variability. When there is more 
than one level of grouping, the higher levels of variability can be estimated. In the specific case 
where the second level of grouping corresponds to multiple periods of measurement, this variabil- 
ity is called within-subject variability (or intra-subject variability, or inter-occasion variability), 
and corresponds to the variation of the individual parameters across the different study periods 
or units. In the context of pharmacokinetics, Karlsson et al. (4j) demonstrate the importance 
of modeling this second level of variability in two-levels NLMEMs. They show that neglecting 
it resulted in biased estimates for the fixed effects. 
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The parameter estimation of NLMEMs is not trivial because the likelihood of NLMEMs 
cannot be expressed in a closed form due to the non-linearity of the regression function in the 
random effects. Therefore, several estimation methods have been proposed. The First Order 
Conditional Estimates (FOCE) algorithm performs a first order linearization of the regression 
function with respect to the random effects ||]; 6). The implementation of the FOCE algorithm 
in the NONMEM software and in the nlme function of Splus and R enables the estimation of both 
between- and within-subject variabilities. From our practice, the main drawback of this method 
is however that it does not always converge when one estimates simultaneously the between- 
and the within-subject variabilities on several parameters. Furthermore, this linearization-based 
method cannot be considered as fully established in theory. For instance, Vonesh (|7|) and Ge 
et al. (3) give examples of specific designs resulting in inconsistent estimates, such as when the 
number of observations per subject does not increase faster than the number of subjects or when 
the variability of random effects is too large. 

Several estimation methods have been proposed as alternatives to linearization algorithms. 
A common method to handle numerical integration is the adaptative Gaussian quadrature 
(AGQ) method. An estimation algorithm of NLMEM parameters based on this classical AGQ 
method has been proposed by Pinheiro and Bates (9) and is implemented in the SAS procedure 
NLMIXED fl. However, the AGQ method requires a sufficiently large number of quadrature 
points implying an often slow convergence with very high computational time. Furthermore, 
two-levels NLMEM can be implemented in the NLMIXED procedure, but to our knowledge, 
the convergence is difficult to obtain in practice (|3j). Improvements upon this method are thus 
needed. The second alternative to linearization is the use of the Expectation-Maximization (EM) 



algorithm ijllh in order to estimate models with missing or non-observed data such as random 



effects. Because of the nonlinearity of the model, stochastic versions of the EM algorithm have 



been proposed. Wei et al. (12); Walker lil3h and Wu (|14f ) propose MCEM algorithms, with a 



Monte-Carlo approximation of the E-step. However the MCEM algorithm may have computa- 
tional problems (i.e slow or even non convergence). As an alternative to address computational 

nri 

problems, a stochastic approximation version of EM (SAEM) has been proposed in l|15l : \lw . 
which requires the simulation of only one realization of the missing data for each iteration, sub- 
stantially reducing the computation time. Kuhn and Lavielle (|16T I propose to combine the SAEM 
algorithm with a Monte-Carlo Markov Chain (MCMC) procedure adapted to the NLMEMs, and 
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prove that the resulting estimates are convergent and consistent. 

To date, none of the EM-based algorithms are directly applicable to the case of multilevel 
NLMEMs and have to be adapted. The objective of this paper is to extend the SAEM algo- 
rithm to MNLMEMs with two levels of variability: both E and M steps need to be adapted 
to integrate higher levels of random effects. We also propose estimates of the likelihood and of 
the Fisher information matrix. We evaluate this algorithm on a PK example, more precisely a 
two-periods one-sequence cross-over trials simulated mimicking theophyllin concentration data 
(jj). We also apply the SAEM algorithm to the PK interaction of two HIV molecules (tenofovir 
and atazanavir) from a PK substudy of the ANRS 107-Puzzle 2 trial. After describing the 
model and notations (section [2]) , section [3] describes the SAEM algorithm. Section [4] reports 
the simulation study and its results. In Section [U we study the PK interaction of tenofovir on 
atazanavir in HIV patients. Section [6] concludes the article with some discussion. 

2 Models 

Let us denote y%jk the observation in unit k (k = 1, . . . , K) for subject i (i = 1, • • • , n) and at 
time tijk (j = 1, • • • ,n.ik). For instance, the different units can be the different periods in the 
case of PK trials, or the different parents in the case of genetic analyses. We assume, as a known 
fact, two nonlinear functions / and g such that the two-levels non-linear mixed effects model 
linking observations to sampling times can be written as: 

IJijk — f(tijk,(t>ik) + g(tijk,4 l ik)£ijk, 
£ijk ~ A/"(0,f7 2 ), 

where 4>ik is the p-vector of the parameters of subject i for unit k and e^k is the measurement 
error. We hypothesize that the errors Sijk given 4>ik are mutually independent. We assume that 
the individual parameters 4>ik are random vectors and that for each unit k, 4>ik can be broken 
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into: 



fak = £t + Pk + h + Cik, 
h ~ JV(0,fi), 
c lfc - JV(0,tf), 



(1) 



where £t + At is the mean value for unit k, bi is the random effect of size p of subject i, and 
Cife is the random effect of size p of subject i and unit k. To ensure the identifiability of the 
parameters, we assume that (3i = 0, ie \i is the mean of the first unit and (3k represents the 
difference (or effect) of the fcth unit in comparison to this first unit. The random effects (bi) 
and (cik) are assumed to be mutually independent. The total variance of the parameters is 
thus broken into a between-subject variance ft and a within-subject variance \D r . Finally, the 
individual parameters pK- vector fa — (fax, . . . , fax) of subject i is distributed with a Gaussian 
distribution with mean vector (/x, /i + 02, ■ ■ ■ , M + Pk) and a pK x pK covariance matrix V equal 
to 

/ n + y Q ... fi 
n + * '•. : 
: '■. "-. n 

\ n ... n + * j 

Let 6 = (/x, P, fl, ty, a 2 ), the vector of all the parameters of the model where j3 denotes the 
vector of unit effect /3 = (fti, . . . ,0k)- The aim of this paper is to propose an estimation of 9 
by maximizing the likelihood of the observations y = (yijk)ijk- 

Let us denote bi := fi + bi. The likelihood of y can be written as: 



\ 



p(y;6) = Jp(y,fab;6)d(fab) 



where p(y, fa b; 6) is the likelihood of the complete data (y, 4>, b), with <j> = (fak)i=i,...,n,k=i,...,K 
and b = (b\, . . . , b n ). Because of the nonlinearity of the regression function / with respect to 
the random effects fak, the likelihood has no closed form. Therefore, the maximization of the 
likelihood in 9, 9 £ 9, is a complex problem. We propose to use a stochastic version of the EM 
algorithm, which is presented in detail in the next section. 
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3 Estimation algorithm 



3.1 The SAEM algorithm 



The EM algorithm introduced by Dempster et al. ijllh is a classical approach to estimate param- 
eters of models with non-observed or incomplete data. In two- levels NLMEMs, the non-observed 
data are the individual parameters (4>, b) and the complete data of the model is (y, <j>, b). Let us 
denote L c (y,(f>,b;9) = logp(y,<j),b;0) the log-likelihood of the complete data. The principle of 
the iterative EM algorithm is to maximize the function Q(0\0') = E(L c (y, b; 0)\y, 0') where 
the expectation is the conditional expectation under the posterior distribution p(<fi, b\y; 6'), the 
maximization of Q being often easier than the direct maximization of the observed data log- 
likelihood. Each iteration of the EM algorithm is computed through two steps: the Expectation 
step (E-step) and the Maximization step (M-step). At the I th iteration of the algorithm, the E 
step is the evaluation of Q(9 \ 9?), while the M step updates 0i by maximizing Q(9 | 9e). 

Let us show that the function Q can be reduced in the case of a MNLMEM. First, let us 
note that as p(b\y, <f>; 9) = p(b\4>; 0), by application of the Bayes theorem we have: 

n 

p{ct>, b\y; 0) = p(b\<f>; 0)p(<f>\y; 0) = nK&il&; (2) 

8=1 

Second, through the linearity of the individual parameters model in equation JTJ, the poste- 
rior distribution p(bi\4>i;0) of the ith subject is explicit: p(bi\4>f,9) is a Gaussian distribution 
Af(m(<pi, 0), V(0)) of mean and variance equal to: 



v{0) ^-^fak-flo + n-vj 



m(4>i,0) = V{9) \ V~ l lJ<t>ik- <'/.) + <>■ V I ■ (3) 

v{0) = (fr 1 + K^- 1 y 1 . 

Due to the factorization given in equation J2]), function Q can be rewritten as: 
QW) = I ( I L c (y,cj ) ,b ] 9)p(b\^9 / )di]p(cl ) \y;9')d^ 



Because of the explicit posterior distribution of random effects b given in equation |(3j), the 
computation of this conditional expectation can be split into two parts : the computation of 
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the integral with respect to the posterior distribution of b which has an analytical form, and 
the computation of the integral with respect to the posterior distribution of <fi which has no 
analytical form. Therefore the EM algorithm is split into an explicit direct EM algorithm for 
the computation of the first integral and the use of a stochastic version of the EM algorithm for 
the computation of the second integral. 

Let us detail the explicit computation of the first integral, denoted by R(y, 4>, 9, 9') 

R(y,<t>,6,6') = J L c (y,cj ) ,b;9)p(b\<f > ;9')db. 

This integral has an analytical form. Indeed, the complete log likelihood L c (y, cj>, b;9) is equal 
to 

K n riik -. K n n ik , , ,n 2 

My,w) = -\ EE E ^(2,aV(u jk M)-l EE E^fvff^r 1 " 

Z k=l i=l j=l k=l i=l j=l 9 ^ k ^ lk > 



nK 1 

— log(27rdet ) - - E E(^ fc - bi ~ M^'H^ik -k- (3 k ) 

k=l i=l 

1 " 

- log(27r det n) - - E(^ - - n). 



i=l 



As the posterior distribution p{b\<f>; 9) is known (equation [3|) , R(y, (j), 9, 9') is equal to 

R(y, 0, 0, 6') = -\ E log^Vfe 4>ik))\ E ^^ffi^^ (4) 
log(27r det *) - y*" 1 / 2 ^)*- 1 ^ - I E(fe - m(&, 0') - ft)**" 1 ^ - m(&, 0') - &) 

n 

™ log(2^det fi) - -fi-VVCflOfi- 1 / 3 - 2 E( m (^'^') - M)*«~V(0i> O - M)- 



i=l 



Therefore Q is reduced to the computation of the second integral under the posterior distri- 
bution p(4>\y; 9) as follows: 

Q(9\9') = J R(y, 0, 9, 9')p^\y; #')#• (5) 

Given the non-linearity of function / with respect to <ft, the posterior distribution p(<p\y; 9') has 
no closed form and the function Q defined by ^ is intractable. Thus we propose to use the 
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stochastic version SAEM of the EM algorithm proposed by Delyon et al. (|15T I which evaluates 
the integral Q by a stochastic approximation procedure. 

Let us detail this SAEM algorithm in the case of two-levels NLMEMs. Let us note that the 
quantity R(y, 4>, 8, 8') belongs to the regular curved exponential family, i.e, it can be written as 

R(y, 0, 8, 8') = -A(0) + (S(y, </>, 9'), *(0)>, (6) 

where (.,.) is the scalar product, A and $ are two functions twice continuously differentiable 
on 6 and S(y, 4>, 8') is known as the minimal sufficient statistics of the complete model. Those 
statistics are detailed later. In this case, the Q function is reduced to 

Q(0\0') = -A(9) + ( \J s(y,<M'M4>|y;0')#J , m ), 

In this case, at the £th iteration, the SAEM algorithm proceeds as follows: 

• Simulation step: simulation of the missing data (<f>i)i under the conditional distribution 

p(4>\r,0i) 



Stochastic approximation step: computation of a stochastic approximation se+i ofE 



S(y,< 



J S(y, <j>, di)p(<f)\y; 8e)d<j), using ("fe)e>o, a sequence of positive numbers decreasing to 0: 

se+i = Si + jt(S(y,(f> w ,d e ) - s e ). 

Maximization step: update of the estimate 8i + \: 



h+x = argmax(-A(0)) + ( S/+1 , $(0)))) . 



The sufficient statistics of the complete model Q evaluated during the S A step of the SAEM 
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algorithm are as follows: 

si,i,e+i = si t i t e + 71 ( — si t j t j \ , i = l,...,N, 

S2,k,l+l = s 2,k,i + ll ^ik ~ S2 ,M , k = l,...,K, 



S3./+1 = S3,£ +11 I ^ rn(<f>\ , 9ifm{4>[ ,8 e ) - s 3 ., 
\i=i 

K n 



S4,£+l — S4 



fc=l 1=1 

2 



j/wk - f(Ujk,<t>ik) 



S5J , 



The expression of the M step is obtained by derivation of equation ([4]) . The parameter estimates 
are as follows: 



Mc+i 



\ U i=l fe=l / 



/?M+i = S2,M+1 - W+i, for A; = 2, . . . ,K, 

n 



n 

1 K 

$ £+1 = v{e l ) + ^--Y,0k,t + x) t dk,i+u 

k=l 

S5,£+l 



Z^i=l Z^fc=l n ifc 



Comparing with the classic SAEM algorithm for single-level NLMEMs, the extension of 
SAEM to the two-levels NLMEMs is finally split into an explicit EM algorithm and a stochastic 
EM part. Furthermore, it requires the computation of two intermediate quantities (the condi- 
tional expectations m((j)i, 9) and variance V{9) of the between-subject random effects parameters 
bi) as well as two additional sufficient statistics (S3 and S4), functions of m((j)i, 9). The M-step 
differs from the one of the classic SAEM for single-level NLMEMs, especially for the estimation 
of the variance matrix Q and \E" which uses the additional quantity V(9) . 

As proved by (Q; \v\ ) , the convergence of the SAEM algorithm is ensured under the following 
assumption: 
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Assumption (Al): 

1. Functions A and <E> are twice continuously differentiable on Q. 

2. The log-likelihood logp(y;0) is d times differentiable on O, where d is the dimension of 

S(y,<f>,6'). 

3. Function s defined as 

8(9,6') = J S(y,<l>,9')p(ci>\y;9)d<!> 
is continuously differentiable on with respect to its first variable. 

4. For all I in N, j e £ [0, l],J2Zi H = 00 and IX i li < °°- 

For a convenient step sizes sequence je, the assumption (Al) is trivially checked in our model. 
A choice of step sizes sequence 7^ is presented in Section [H 

However, the simulation step of the SAEM algorithm, which performs the simulation of the 
non-observed vector <f> under the posterior distribution p(<j)\y\ 9) cannot be directly performed 
because the posterior distribution is only known up to a constant. In this case, Kuhn and Lavielle 



(la) propose to combine the SAEM algorithm with a Markov Chain Monte Carlo (MCMC) 
procedure for the simulation step. This version of the SAEM-MCMC algorithm can be used for 
the estimation of MNLMEMs. The MCMC procedure used in this case is detailed in section [ 



As proved by (|17l ). the convergence of the SAEM-MCMC algorithm is ensured under assumption 
(Al) and the following additional assumption: 
Assumption (A2): 

For any 9 in 6, we assume that the conditional distribution p(.\y\0) is the unique limiting 
distribution of a transistion probability lie, that has the following properties: 

1. For any compact subset V of 8, there exists a real constant L such that for any (9, 6') in 

V 2 

sup |n*(0'|0)-n fl /(0V)|<L||0-0 / ||. 

{$,4>'}e£ 

2. The transition probability Ug supplies an uniformly ergodic chain whose invariant proba- 
bility is the conditional distribution p(<j>\y;9), i.e. 

3K e eR+, 3p e e]0,l[ I WeN -p(-, ■\y;d)\\ T v < C eP l e , 
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where || • \\tv is the total variation norm. Furthermore, 

C = sup Ce < oo and p = sup pg < 1 . 
eee flee 

3. Function S is bound on £. 

At iteration £, the S-step of the SAEM-MCMC algorithm consists thus in simulating <fy™ with 
the transistion probability Hg (<jf>^ -1 ) \d(j)^) . 

The assumption (A2.2) is the most delicate to check, and it depends on the choice of the 
MCMC algorithm. This is detailed in the next subsection after presenting the MCMC procedure. 

In practice, the SAEM algorithm being a stochastic algorithm, there exists no deterministic 
convergence criterion which could be used to stop the iterations of the algorithm as soon as the 
convergence is reached. Therefore, we recommend to implement the SAEM algorithm with a 
sufficiently large number of iterations and to graphically check the convergence by plotting the 
values of the SAEM estimates obtained along iterations versus the iterations. Such a figure is 
described in Section [H 

3.2 MCMC algorithm for the simulation step 

Let us detail the simulation step of the SAEM-MCMC algorithm, which performs the simula- 
tion of the missing data 4> through a Markov chain which has p(4>\y; 9) as unique stationary 
distribution. For subject i, by Bayes formula, this conditional distribution is proportional to 

K 

p{<f>i\yi\0) tx Y[p{yik\0ik;d)p(<t>i;d)- 

k=l 

We propose to use a Metropolis-Hastings (M-H) algorithm to simulate this Markov chain. 
Let us recall the principle of this algorithm. At iteration r of the M-H algorithm, given the 

(r) 

current value of the Markov Chain, the M-H algorithm proceeds as follows: 

1. Simulate 4>1 with a proposal distribution q(-,(f>\ r ') 

2. Compute the acceptance probability 

a( ,c at), = nf=iP(yifcl# fc ;%(ff;g) <mAh 



li 



3. Simulate u with a uniform distribution U[0, 1] 

4. Update the Markov chain with 



if u<a(0?,4 r) ) 



else 



The convergence of the M-H algorithm strongly depends on the choice of the proposal distribu- 
tion q. The convergence is ensured for some proposal distributions such as independent (<?(•, <pf ) 
independent of or symmetrical ( ff (, = ,(^, 0) Proposals Q). These proposals are 
detailed below. Given the dimension of 0, we also consider a Metropolis-Hastings- Within-Gibbs 
algorithm, combining both Gibbs algorithm and M-H procedure. The advantage of the Gibbs 
algorithm is to reduce the multi-dimensional simulation problem to the successive simulations of 
one-dimension vectors. Finally, at iteration I of the SAEM algorithm, given the current estimate 
9t, we combine the three following proposal transitions: 

1. the prior distribution of <j>i, that is the Gaussian distribution M(pt+ fig, Yi), corresponding 
to an independent M-H algorithm, 

2. the multidimensional random walks N{4>f~ X \ pTg) (symmetric proposal), where p is a 
scaling value chosen to ensure a satisfactory acceptation rate, namely around 30% as 
proposed in ifisi . 

3. a succession of Kp unidimensional Gaussian random walks (symmetric proposal), i.e each 
component of (pi is successively updated, leading to a Metropolis-Hastings- Within-Gibbs 
algorithm, 

where Tg is equal to 



Given the proposal distributions, as previously detailed, and using the theoretical convergence 



results proposed in (|18l ). this hybrid Gibbs algorithm converges and generates an uniformly 
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ergodic chain with p((p\y, 9) as the stationary distribution. Consequently, by applying the con- 
vergence theorem of Kuhn and Lavielle ifl^ l and under assumptions (Al) and (A2), we prove that 
the estimate sequence (9i)i>o produced by the extended SAEM algorithm converges towards a 
(local) maximum of the likelihood p(y; 9). 

In practice, the convergence of the MCMC algorithm is difficult to verify. As in Bayesian 
inference, the only convergence criteria existing for MCMC procedure are graphical criteria. We 
have to check if the estimate sequence explores a sufficiency large domain of the Markov chain. 
A convergence figure is presented and commented in section 31 

3.3 Estimation of the Fisher information matrix and the likelihood 

To perform statistical tests such as Wald test or likelihood ratio test, we propose estimators of the 
Fisher information matrix and the likelihood, respectively. As the Fisher information matrix has 
no closed form in MNLMEMs, we propose to approximate it by the Fisher information matrix 
of the multi-level linear mixed model deduced from the MNLMEM after linearization of the 
function / around the conditional expectation of the individual parameters (E(<pi\y; 9), 1 < i < 
n). The computation of this linearized Fisher information matrix is direct and does not need 
any approximation. 

The estimation of the likelihood of the MNLEM is based on an Importance Sampling pro- 
cedure, as proposed by Samson et al. for NLMEMs. The Importance Sampling procedure 
has been introduced to approximate the integral of the likelihood with a smaller variance than 
with other Monte Carlo methods. In this case, an estimate of the contribution p(yi',9) of the 
individual i to the likelihood is 



/-i 'j-^'i ) 



where )t=i,...,T are simulated using the individual instrumental distribution As an in- 
dividual instrumental distribution we propose a Gaussian approximation of the individual 
conditional posterior distribution p(4>i\yi] 9). 
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4 Simulation study: a PK example 



4.1 Simulation settings 

The objective of this simulation study is to illustrate the main statistical properties of the 
extended SAEM algorithm (bias, root mean square errors, group comparison tests) and to 
compare them to the FOCE algorithm. We do not use the AGQ algorithm, since procedure 
NLMIXED does not succeed, in practice, in estimating such complex variance models. 

We use the PK data of orally administered theophyllin to define the population model for the 
simulation study. These data are classical ones in population pharmacokinetics, often used for 



software evaluation l|2lh . We assume that concentrations can be described by a one-compartment 



model with first order absorption and first order elimination: 



where D is the dose, V is the volume of distribution, K a is the absorption rate constant and CI is 
the clearance of the drug elimination. These parameters are positive and distributed according to 
a log-normal distribution. Thus, 4> nas the following components: <j> = (l°g V, log K a , log AUC), 
with AUC = D/Cl. We assume identical sampling times for all subjects: for all i in 1, . . . ,n, 
k = 1,2, tijk — tj for j = 1, . . . , J. Additive Gaussian random effects are assumed for each 
parameter with a diagonal covariance matrix fl and a a diagonal covariance matrix '5. Let 
uj 2 = (wy,w|- ,uj\ uc ) and ip 2 — (ipy,tpK i^auc) denote the vector of the variances of the 
random effects. A combined error model is assumed by setting g(t, <j>) = 1 + f(t, <j>). 

We set the dose for all subjects to the value of 4 mg. For all the parameters, the values are 
those proposed by Panhard and Mentre j^): logV = —0.73, logi^a = 0.39 and log AUC — 4.61, 
J\r = 0.01, uj 2 Ka = 0.04, uj 2 auc = 0.04, Vy = 0.0025, i? Ka = 0.01, ip 2 AUC = 0.01 and a 2 = 0.01. 
We generate n = 24 and n = 40 total numbers of subjects with J — 10 blood samples per 
subject, taken at 15 minutes, 30 minutes, 1, 2, 3.5, 5, 7, 9, 12 and 24 hours after dosing. The 
individual data of one simulated trial are displayed in Figure [TJ 

[Figure 1 about here.] 
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4.2 Evaluation of estimates 

Our aim is to evaluate and compare the estimates produced by the extended SAEM algorithm 
with those produced by the nlme function of the R software. We fit the simulation model and 
compute the relative bias and relative root mean square error (RMSEs) for each component of 
9 from 1000 replications of the two trials described below (n = 24 and n = 40 total number of 
subjects). 

The simulation model includes a treatment effect on all components of 9. We test the null 
hypothesis {P\ og Auc — 0} using the Wald test. We also fit the model where the treatment effect 
on log AUC is not estimated, and test the same null hypothesis using the Likelihood Ratio Test 
(LRT). 

The SAEM algorithm is implemented with 500 iterations. During the first 200 iterations, a 
constant step size 7^ = 1 is chosen, in order to let the Markov chain explore the parameters do- 
main. Then during the last 300 iterations, the stochastic approximation scheme is implemented 
with a step size equal to = e _\oQ at iteration I. This choice of step size sequence verifies 
convergence assumption (Al.l). The evolution of each SAEM parameter estimates is plotted 
against iterations (logarithmic scale) on Figure El During the first iterations of the SAEM 
algorithm, the estimate sequences explore randomly some neighborhoods of the initial values, 
through the Markov chain simulation. In particular, these behaviors are clearly visible for the 
fixed effect parameters (p, and (3). After 200 iterations, the estimates converge then rapidly to a 
neighborhood of the maximum likelihood, due to the stochastic approximation scheme. In this 
example, the iteration number has been chosen such that the convergence is clearly attained 
before the last iteration. 

[Figure 2 about here.] 

The relative bias and RMSEs obtained on the 1000 datasets with n — 24 and n — 40 subjects 
are presented in Table [TJ 

[Table 1 about here.] 

The bias and the RMSEs of the fixed effects (fi) are small with the SAEM algorithm and are 
almost half of those obtained with nlme (especially the RMSEs). The bias and RMSEs of the 
unit effect {(}) are small and on the same order with both methods. For the between-subject 
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variability parameters (O), the bias are reduced with SAEM, while the RMSEs are of the same 
order with both methods. For the within-subject variability parameters the bias and the 
RMSEs are satisfactory, and on the same order with both methods. The bias and RMSE for a 2 
are small and satisfactory for both methods. 

The type I error of the Wald test and of the LRT are evaluated on the same 1000 datasets. 
For n = 24, the type I errors are 6.0% and 6.5% for SAEM and nlme, respectively, for the Wald 
test, and 4.6% and 5.6% for SAEM and nlme, respectively, for the LRT. For n = 40, the type 
I error are 5.6% and 5.4% for SAEM and nlme, respectively, for the Wald test, and 5.8% and 
5.2% for SAEM and nlme, respectively, for the LRT. 

5 Application to the population pharmacokinetics of atazanavir 
with tenofovir 

5.1 Study population: ANRS 107 - Puzzle 2 study 

The Puzzle 2 - ANRS 107 trial was a randomized open-label, multiple-dose study supported 
by the French Agence Nationale de Recherche sur le Sida (ANRS) with HIV-infected patients 
in treatment failure with their antiretroviral therapy. Patients were randomized to receive for 
the first two weeks either their unchanged treatment with Pis and nucleoside reverse transcrip- 
tase inhibitors (NRTIs) (group 1) or unchanged treatment with NRTIs in combination with 
atazanavir (300 mg QD) plus ritonavir (100 mg QD) as a substitute for the failing PI therapy 
(group 2). From week 3 (day 15) to week 26, patients from either group switched to atazanavir 
(300 mg QD) plus ritonavir (100 mg QD) plus tenofovir DF at 300 mg QD and NRTIs selected 
according to the baseline reverse transcriptase genotype of the HIV isolated in each patient. 

In this paper, we analyze concentration data obtained from 10 patients from group 2 who 
were included and measured at 1, 2, 3, 5, 8, and 24 h after administering drug during each 
treatment period. Those exact dosing intervals were recorded. The objective of the substudy 
was to measure the pharmacokinetic parameters of atazanavir (administered with ritonavir) 
either before (day 14 [week 2]) or after (day 42 [week 6]) initiation of tenofovir DF in HIV- 
infected patients in order to detect pharmacokinetic interactions of tenofovir on atazanavir. 
Data of this substudy were analyzed using a nonlinear mixed effect model by Panhard et al fl) 
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and a significant effect of the co-administration of tenofovir on the pharmacokinetic parameters 
of atazanavir was found using the nlme function of the Splus software. 

The aim of the present analysis is to evaluate the effect of tenofovir on the PK parameters 
of atazanavir using the SAEM algorithm and the Wald test described in section 4. 

We use the one-compartment model with zero-order absorption proposed by Panhard et al. 
I23I to describe atazanavir concentrations: 

FD (, e-* r, «"-.(l-e-* T ')e-*"- T -A 

= WTi ( (1 - ^ + JT^wf j 

with F the bioavailability, V the volume of distribution of atazanavir, (T a ) the absorption 
duration, CI the elimination clearance and r the dosing interval (24 hours until the PK visit). 
The vector of the logarithm of the identifiable parameters is <f> = (\og(V/F), log(T a ), \og(AUC)). 
Data of both treatment periods are simultaneously analyzed using a NLMEM with two levels 
of variability (the between-patient and within-patient variabilities) on each PK parameter. A 
treatment effect is also estimated for each PK parameter, and a homoscedastic error model is 
used. 

5.2 Results 

Concentrations versus time are displayed in Figure [3j The SAEM algorithm succeeds in the 
estimation of all the parameters. The resulting parameters estimates are displayed in table 
El The SAEM algorithm estimates the AUC between-patient variability and the V/F and T a 
within-patient variabilities to 0.48, 0.69 and 0.19, respectively. The three other variances are 
estimated to 0. 

[Table 2 about here.] 

A significant effect of co-medication with tenofovir is found on log(j4£7C) (p=0. 00015) with the 
Wald test based on the SAEM algorithm. 

The individual prediction curves for the two periods are overlaid on the concentration data on 
Figure [3J for 10 patients. The goodness-of-fit plots (population and individual predicted concen- 
trations versus observed concentrations; standardized residuals versus predicted concentrations 
and versus time) are judged satisfactory, and are displayed in Figure [4l 
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[Figure 3 about here.] 
[Figure 4 about here.] 



6 Discussion 

The main original element of this study is the development of the SAEM algorithm for two- 
levels non-linear mixed effects models. We extend the SAEM algorithm developed by Kuhn and 



Lavielle (|16T I , which was not yet adapted to the case of MNLMEMs with two levels of random 



effects. This algorithm will be implemented in the 3.1 version of the monolix software, freely 



available on the following website: http: / /monolix. org The two levels of random effects are the 



between-subject variance and the within-subject (or between-unit) variance, with N subjects 
and K units, with no restriction on N or K . We show that the SAEM algorithm is split into two 
parts: an explicit EM algorithm and a stochastic EM part. The integration of the term p(b\<p; 9) 
in the likelihood results in the derivation of two additional sufficient statistics compared to the 
original algorithm. Furthermore it uses two intermediate quantities, the conditional expectations 
and variance of the between-subject random effects parameters b. The addition of higher levels 
of variability would therefore require other extensions of the algorithm. 

The convergence of the algorithm is monitored from a graphical criterion, as shown in Figl. 
An automatic implementation of that stopping criterion to optimize both the number of itera- 
tions and the stochastic approximation step should be considered in future work and extension 
of the Monolix software. 

The simulation study illustrates the accuracy of our approach. We show that the bias and 
RMSEs obtained by the extended SAEM algorithm are satisfactory for all parameters. The 
bias are reduced compared to those obtained with the FOCE algorithm implemented in the 
nlme function of the R software. The bias are especially divided by two for the fixed effects 
parameters with SAEM. Furthermore, whereas the nlme implementation of the FOCE algorithm 
does not always converge with both between- and within-patient variability on all parameters, 
the extended SAEM algorithm does. We develop the tests for a difference between the units, 
and we obtain type I errors close to the expected 5% for the Wald test and the LRT. 

The analysis of the pharmacokinetics of atazanavir with tenofovir in the Puzzle 2 - ANRS 
107 trial also demonstrates the ability of the extended SAEM algorithm to detect treatment 
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interaction on a real data set. When testing for an interaction of tenofovir on the PK of 
atazanavir, the impact of tenofovir on the absorption of atazanavir is confirmed; more precisely, 
a decrease of the AUC of atazanavir as shown by Panhard et al. (|23T I is found. 

We compare the extension of SAEM to the FOCE algorithm, that is the most popular method 
in population pharmacokinetics, which is one of the largest application fields of NLMEM. We try 
to use the NLMIXED procedure of SAS implementing Gaussian quadrature. However, procedure 
NLMIXED does not succeed, in practice, in estimating such complex variance models, on our 
simulated data, neither on the atazanavir dataset. The next step is a comparison with a Bayesian 
estimation of the parameters using Winbugs, which is beyond the scope of this paper. 

The next ambitious development would be an extension of the algorithm to the case of 
MNLMEMs with more than two levels of random effects, in order to analyze, for instance, 
genetic data where more than one generation of parents are taken into account. However, it 
would be difficult to develop a general estimation method since it strongly depends on the 
relation (linear or not) of the different levels of random effects. 

To conclude, the extended SAEM algorithm combines the statistical properties of an exact 
method together with a high computational efficiency. We thus recommend the use of this 
method in MNLMEMs. 
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A Index of notations 



Model notations 

i (i = 1, • • • , n): index of subject 

j (j = 1, • • • , rijfe): index of measurement of subject i for unit k 
k (k = 1, . . . , index of unit 

iijfe : measurement time in unit fc for subject i and measurement j 
Uijk- observation in unit k for subject i at timetyfc 

y = (Uijk)ijk- vector of the observations in the K units for all the n subject 
/ and g: non-linear functions linking observations to sampling times 
fak- P~ vector of the parameters of subject i for unit k 
fa = {fail ■ • ■ j 4>ik)'- pK -vector of individual parameter of subject i 

(p = {4>ik)i=l,...,n,k=l,...,K 

/u: p- vector of the mean of the individual parameters for k = 1 
Pk'- effect of the fcth unit in comparison to this first unit 
ft = [3k)'- vector of the unit effects 

bf. random effect of size p of subject i 

hi := fi + bi b = (b\, . . . , b n ) c^: random effect of size p of subject i and unit k 

Eijk'- measurement error 

fl: p x p between-subject covariance matrix 

p x p within-subject covariance matrix 
T: pK x pK covariance matrix of the individual parameters fa (i = 1, • • • , n) 
a 2 : variance of the measurement error 
9 = [ji, (3, Q, "J, a 2 ): vector of all the parameters 

Algorithm notations 

p(y,fab;0): likelihood of the complete data 
L c (y, fa b; 9): log-likelihood of the complete data 
p(fa b\y; 6'): posterior distribution of {<p,b) given (y;0') 
p(4>\y; 9'): posterior distribution of 4> given (y; 9') 
p(b\(f>;9): posterior distribution of b given (faO) 



22 



m((j),9): mean of p(b\<j>; 9) 

V(6): variance oi p{b\<f>; 9) 

Q(9\9>) :=E{L c {y,4>;b;6)\y;6')) 

R(y, 4>, 9, 9') := / L c (y, <f>, b; 9)p(b\cf>; 9>)db 

A(9,9 / ) and <S>(6,8'): functions of 6 x 6 

£: iteration number of the SAEM algorithm 

(j^f 1 : missing data simulated at iteration (, 

S(y, 4>): minimal sufficient statistics of the complete model of dimension d 
st + \\ stochastic approximation of E S(y,<p)\$t 
(ji)e>o'- step sizes sequence 
He: transition probability 

q{4>1,(j)l^): proposal distribution of the Metropolis-Hastings (M-H) algorithm 

<P1: candidate simulated using q{4>1,<j)-p) 

a(<f)°,<f>f^): acceptance probability of the M-H algorithm 

u: scalar sample generated with a uniform distribution U[0, 1] 

Cg and pg: constants involved in the proof of uniform ergodicity of the Markov Chain 
qf. instrumental distribution used for the estimation of p(yi]9) by Importance Sampling 
T: number of simulated set of parameters in Importance Sampling 



PK example notations 

D: drug dose 

V: volume of distribution of the drug (in liters) 
K a : absorption rate constant (in hours -1 ) 
T a : absorption duration (in hours) 
CI: clearance of elimination (in liters. hours -1 ) 
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Figure 1: Simulated theophyllin concentration data for 24 subjects during the first period (plain 
line) and during the second period (dotted line) 
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Figure 2: Evolution of the estimates, function of the iteration of SAEM algorithm (with a 
logarithm scale for the abscis axis). 
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Figure 3: Individual concentrations and individual predicted curves for the pharmacokinetics of 
atazanavir in 10 subjects: x and *, observations with and without tenofovir, respectively; dotted 
and plain line, individual predictions of the atazanavir pharmacokinetics with and without 
tenofovir, respectively. 
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Figure 4: Goodness-of-fit plots for atazanavir final population PK model: population (a) and 
individual (b) predicted concentrations (in ng/mL) versus observed concentrations (in ng/mL), 
standardized residuals versus predicted concentrations (in ng/mL) (c) and versus time (in hours) 
(d). 
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Table 1: Relative biases (%) and relative root mean square errors (RMSE) (%) of the esti- 
mated parameters evaluated by the extended SAEM algorithm and the FOCE algorithm (nlmc 
function) from 1000 simulated trials. 

n=24 subjects n=40 subjects 



Bias RMSE Bias RMSE 





SAEM 


nlme 


SAEM 


nlme 


SAEM 


nlme 


SAEM 


nlme 


V 


0.01 


0.53 


3.9 


6.4 


-0.06 


0.54 


2.91 


5.00 




0.48 


-1.48 


14.4 


24.4 


0.02 


-3.07 


10.79 


18.4 


AUC 


-0.08 


-0.20 


1.0 


1.5 


-0.11 


-0.24 


0.79 


1.13 


Pv 


-0.00 


-0.01 


3.6 


3.6 


-0.05 


-0.06 


2.83 


2.80 




-0.73 


-0.76 


14.2 


18.8 


0.24 


0.27 


10.73 


10.60 


PAUC 


0.02 


-0.02 


0.7 


1.1 


0.00 


1.14 


0.57 


1.14 




-5.13 


-5.92 


38.7 


38.4 


-3.45 


-4.28 


30.30 


30.37 


< 


-3.99 


-7.07 


42.4 


41.5 


-3.23 


-4.63 


33.49 


33.32 


U AUC 


-4.88 


-7.29 


34.5 


34.2 


-1.51 


-3.80 


27.41 


27.02 




-8.67 


-7.29 


69.4 


68.5 


-5.93 


-4.91 


58.78 


57.81 




-10.94 


-9.09 


73.5 


72.0 


-7.06 


-5.17 


62.00 


60.60 


i'AUC 


-5.37 


-5.37 


43.6 


42.6 


-4.92 


-5.79 


33.31 


32.47 


a 2 


-0.33 


-0.10 


7.7 


7.7 


0.28 


0.67 


6.03 


6.09 
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Table 2: Pharmacokinetic parameters of atazanavir (estimate and SE (%)) estimated with the 
SAEM 



Parameters Estimate SE (%) 



log(VVF) (L) 


4 


.01 


5. 


.79 


log(T a ) (h) 


1 


,36 


6. 


.72 


log(AUC) (ng.mL-^h) 


10 


.67 


1 


.61 


0V/F 





.12 


267 


.43 







.33 


45 


.03 


Pauc 


-0 


.38 


25 


.31 


^V/F 











LUT a 











UAUC 


0. 


.48 


25 


.48 


tpV/F 





.55 


28 


.30 


4>T a 





.16 


35 


.76 


4>AUC 











a (ng.mL^ 1 ) 


732 


.29 


8. 


.40 
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